---
title: "JSONConverter"
id: jsonconverter
slug: "/jsonconverter"
description: "Converts JSON files to text documents."
---

# JSONConverter

Converts JSON files to text documents.

|  |  |
| --- | --- |
| **Most common position in a pipeline** | Before [PreProcessors](../preprocessors.mdx), or right at the beginning of an indexing pipeline |
| **Mandatory init variables** | ONE OF, OR BOTH:  <br /> <br />"jq_schema": A jq filter string to extract content  <br /> <br />"content_key": A key string to extract document content |
| **Mandatory run variables** | "sources": A list of file paths or [ByteStream](../../concepts/data-classes.mdx#bytestream) objects |
| **Output variables** | "documents": A list of documents |
| **API reference** | [Converters](/reference/converters-api) |
| **GitHub link** | https://github.com/deepset-ai/haystack/blob/main/haystack/components/converters/json.py |

## Overview

`JSONConverter` converts one or more JSON files into text documents.

### Parameters Overview

To initialize `JSONConverter`, you must provide the `jq_schema` parameter, the `content_key` parameter, or both.

The `jq_schema` parameter is a jq filter string that extracts nested data from JSON files. Refer to the [jq documentation](https://jqlang.github.io/jq/) for the filter syntax. If it's not set, the entire JSON file is used.
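To get a feel for what a jq filter string does before passing it to the converter, you can run it directly with the `jq` Python package. The filter `.laureates[].surname` below is an illustrative assumption, not something `JSONConverter` requires:

```python
import jq

data = {"laureates": [{"surname": "Fermi"}, {"surname": "Levi-Montalcini"}]}

# ".laureates[]" yields each element of the "laureates" array;
# ".surname" then selects that key from each element.
surnames = jq.compile(".laureates[].surname").input(data).all()
print(surnames)  # ['Fermi', 'Levi-Montalcini']
```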

The `content_key` parameter lets you specify which key in the extracted data will be the document's content.

- If both `jq_schema` and `content_key` are set, the `content_key` is searched in the data extracted by `jq_schema`. Non-object data will be skipped.
- If only `jq_schema` is set, each extracted value must be a scalar; objects and arrays will be skipped.
- If only `content_key` is set, the source must be a JSON object, or it will be skipped.
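The rules above can be sketched in plain Python. This is a toy approximation of the selection logic, not Haystack's actual implementation; `items` stands in for the values produced by the jq filter (or the parsed JSON source itself when no `jq_schema` is given):

```python
def select_content(items, content_key=None):
    """Toy sketch of the selection rules above (not Haystack's code)."""
    contents = []
    for item in items:
        if content_key is not None:
            # With content_key, only JSON objects are usable; others are skipped.
            if isinstance(item, dict) and content_key in item:
                contents.append(str(item[content_key]))
        else:
            # Without content_key, only scalar values are usable;
            # objects and arrays are skipped.
            if not isinstance(item, (dict, list)):
                contents.append(str(item))
    return contents

print(select_content([{"text": "hello"}, 42], content_key="text"))  # ['hello']
print(select_content(["a scalar", {"obj": 1}]))  # ['a scalar']
```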

Check out the [API reference](../converters.mdx) for the full list of parameters.

## Usage

You need to install the `jq` package to use this Converter:

```shell
pip install jq
```

### Example

Here is a simple example of using the component on its own:

```python
import json

from haystack.components.converters import JSONConverter
from haystack.dataclasses import ByteStream

source = ByteStream.from_string(json.dumps({"text": "This is the content of my document"}))

converter = JSONConverter(content_key="text")
results = converter.run(sources=[source])
documents = results["documents"]
print(documents[0].content)
## 'This is the content of my document'
```

In the following, more complex example, we provide a `jq_schema` string to filter the JSON source files and `extra_meta_fields` to specify which fields from the filtered data to store in each document's metadata:

```python
import json

from haystack.components.converters import JSONConverter
from haystack.dataclasses import ByteStream

data = {
  "laureates": [
    {
      "firstname": "Enrico",
      "surname": "Fermi",
      "motivation": "for his demonstrations of the existence of new radioactive elements produced "
      "by neutron irradiation, and for his related discovery of nuclear reactions brought about by"
      " slow neutrons",
    },
    {
      "firstname": "Rita",
      "surname": "Levi-Montalcini",
      "motivation": "for their discoveries of growth factors",
    },
  ],
}
source = ByteStream.from_string(json.dumps(data))
converter = JSONConverter(
  jq_schema=".laureates[]", content_key="motivation", extra_meta_fields={"firstname", "surname"}
)

results = converter.run(sources=[source])
documents = results["documents"]
print(documents[0].content)
## 'for his demonstrations of the existence of new radioactive elements produced by
## neutron irradiation, and for his related discovery of nuclear reactions brought
## about by slow neutrons'

print(documents[0].meta)
## {'firstname': 'Enrico', 'surname': 'Fermi'}

print(documents[1].content)
## 'for their discoveries of growth factors'

print(documents[1].meta)
## {'firstname': 'Rita', 'surname': 'Levi-Montalcini'}
```
